WAN-optimized local and cloud spanning deduplicated storage system

ABSTRACT

A spanning storage interface facilitates the use of cloud storage services by storage clients. The spanning storage interface presents one or more data interfaces to storage clients at a network location, such as file, object, data backup, archival, and storage block based interfaces. The data interfaces allow storage clients to store and retrieve data using non-cloud based protocols. The spanning storage interface may perform data deduplication on data received from storage clients. The spanning storage interface may transfer the deduplicated version of the data to the cloud storage service. The spanning storage interface may include local storage for storing a copy of all or a portion of the data from storage clients. The local storage may be used as a local cache of frequently accessed data, which may be stored in its deduplicated form.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Patent Application No. 61/315,392, filed Mar. 18, 2010 and entitled “WAN-OPTIMIZED LOCAL AND CLOUD SPANNING DEDUPLICATED STORAGE SYSTEM” and to U.S. Provisional Patent Application No. 61/290,334, filed Dec. 28, 2009 and entitled “DEDUPLICATED OBJECT STORAGE SYSTEM AND APPLICATIONS,” which are incorporated by reference herein for all purposes. This application is related to U.S. patent application Ser. No. ______ [Docket Number R001520US], filed ______, and entitled “DISASTER RECOVERY USING LOCAL AND CLOUD SPANNING DEDUPLICATED STORAGE SYSTEM,” which is incorporated by reference herein for all purposes.

BACKGROUND OF THE INVENTION

The present invention relates generally to data storage systems, and systems and methods to improve storage efficiency, compactness, performance, reliability, and compatibility. In general, data storage systems receive and store all or portions of arbitrary sets or streams of data. Data storage systems also retrieve all or portions of arbitrary sets or streams of data. A data storage system provides data storage and retrieval to one or more storage clients, such as user and server computers. Stored data may be referenced by unique identifiers and/or addresses or indices. In some implementations, the data storage system uses a file system to organize data sets into files. Files may be identified and accessed by a file system path, which may include a file name and one or more hierarchical file system directories.

Many data storage systems are tasked with handling enormous amounts of data. Additionally, data storage systems often provide data access to large numbers of simultaneous users and software applications. Users and software applications may access the file system via local communications connections, such as a high-speed data bus within a single computer; local area network connections, such as an Ethernet networking or storage area network (SAN) connection; and wide area network connections, such as the Internet, cellular data networks, and other low-bandwidth, high-latency data communications networks.

Cloud storage services are one type of data storage available via a wide-area network. Cloud storage services provide storage to users in the form of a virtualized storage device available via the Internet. In general, users access cloud storage to store and retrieve data using web services protocols, such as REST or SOAP. Cloud storage service providers manage the operation and maintenance of the physical data storage devices. Users of cloud storage can avoid the initial and ongoing costs associated with buying and maintaining storage devices. Cloud storage services typically charge users for consumption of storage resources, such as storage space and/or transfer bandwidth, on a marginal or subscription basis, with little or no upfront costs. In addition to the cost and administrative advantages, cloud storage services often provide dynamically scalable capacity to meet their users' changing needs.

The term “data deduplication” refers to some process of eliminating redundant data for the purposes of storage or communication. Data deduplicating storage typically compares incoming data with the data already stored, and only stores the portions of the incoming data that do not match data already stored in the data storage system. Data deduplicating storage maintains metadata to determine when portions of data are no longer in use by any files or other data entities.

The CPU and I/O requirements for supporting an extremely large data deduplicating storage are significant, and are difficult to satisfy through vertical scaling of a single device. As a result, prior spanning storage interfaces may impose severe throughput, latency, and other performance penalties on storage clients. Additionally, performance considerations limit the amount and types of optimizations and compression applied by prior spanning storage interfaces.

Additionally, prior spanning storage interfaces have difficulty operating with cloud storage systems. Data deduplication often requires frequent comparisons of incoming data with previously-stored data to identify redundant data. However, cloud data storage is accessible only via a wide-area network, such as the Internet, with significant latency and bandwidth limitations as compared with local-area and storage-area networks. Therefore, prior spanning storage interfaces have poor performance when used with cloud storage systems.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be described with reference to the drawings, in which:

FIG. 1 illustrates an example of spanning storage interface according to an embodiment of the invention;

FIG. 2 illustrates example data structures used by a spanning storage interface according to an embodiment of the invention;

FIGS. 3A-3B illustrate a method of converting a data stream into deduplicated data according to an embodiment of the invention;

FIG. 4 illustrates a method of retrieving an original data stream from deduplicated data according to an embodiment of the invention;

FIG. 5 illustrates a method of deleting a data stream from a spanning storage interface according to an embodiment of the invention;

FIG. 6 illustrates a computer system suitable for implementing embodiments of the invention; and

FIG. 7 illustrates an example disaster recovery application of a spanning storage interface according to an embodiment of the invention.

SUMMARY

Embodiments of the invention include a spanning storage interface adapted to facilitate the use of cloud storage services by storage clients. A spanning storage interface presents one or more data interfaces to storage clients at a network location. These data interfaces may include file, object, data backup, archival, and storage block based interfaces. Each of these data interfaces allows storage clients to store and retrieve data using non-cloud based protocols. This allows storage clients to store and retrieve data in the cloud storage service using their native or built-in functions, rather than having to be rewritten and/or reconfigured to operate with a cloud storage service.

To improve performance of the spanning storage interface, an embodiment of the invention performs data deduplication on data received from storage clients. Once the received data has been deduplicated, the spanning storage interface may transfer the deduplicated version of the data to the cloud storage service. By transferring data in deduplicated form to and from the cloud storage service, these embodiments of the invention improve storage performance by reducing the time and network bandwidth required to access data, as well as reducing the total amount of storage required. If a storage client wishes to access data previously stored in the cloud storage service, the spanning storage interface retrieves the corresponding deduplicated data and reconstructs the original data.

In an embodiment, the spanning storage interface may include local storage for storing a copy of all or a portion of the data from storage clients. The local storage may be used as a local cache of frequently accessed data. In a further embodiment, the local cache stores data in its deduplicated form.

The spanning storage interface may operate with multiple cloud storage services to provide storage clients with a range of storage options. In a further embodiment, the spanning storage interface may send different portions of the received data to different cloud storage services based on user-specified attributes or criteria, such as all or a portion of the file path associated with the received data.

In an embodiment, two or more spanning storage interfaces may be used in a disaster recovery application. A disaster recovery application may be used to provide redundant data access to storage clients in the event that the storage clients and/or cloud spanning storage interface at a first network location are disabled, destroyed, or otherwise inaccessible or inoperable. A disaster recovery application includes at least first and second spanning storage interfaces at first and second network locations. The second spanning storage interface is provided for at least disaster recovery operations. The second spanning storage interface includes second local storage for improving data access performance. A copy of the local cache of the first spanning storage interface is transferred to the second local storage while the first network location is operating. In the event of a disaster affecting the first network location, the second spanning storage interface can provide data access to the first network location's data with the improved performance benefit using the copy of the local cache in the second local storage.

Embodiments of the disaster recovery application may use the second network location as a dedicated disaster recovery network location. Alternatively, the second network location may also optionally be used with one or more of its own local storage clients. In this further example, the second spanning storage interface performs data deduplication and facilitates cloud storage for data from storage clients at the second network location in addition to acting as a disaster recovery system for the first network location. In yet a further embodiment, the first spanning storage interface may act as a disaster recovery system for the second spanning storage interface, just as the second spanning storage interface may act as a disaster recovery system for the first spanning storage interface. This pairing of spanning storage interfaces for disaster recovery may be extended to three or more network locations.

DETAILED DESCRIPTION

FIG. 1 illustrates an example of spanning storage interface 100 according to an embodiment of the invention. An example installation of the spanning storage interface 100 includes one or more client systems 105, which may include client computers, server computers, and standalone network devices. Client systems 105 are connected with a spanning storage interface 125 via a local-area network and/or a storage area network 115. Cloud storage 175 is connected with the spanning storage interface 125 by at least a wide-area network 177 and optionally an additional local area network. Cloud storage 175 includes a cloud storage interface 180 for communicating with the spanning storage interface 125 via wide-area network 177 and at least one physical data storage device 185 for storing data.

Embodiments of spanning storage interface 100 may support a variety of different storage applications using cloud data storage, including general data storage, data backup, disaster recovery, and deduplicated cloud data storage. In the case of general data storage applications, a client, such as client 105 c, may communicate with the spanning storage interface 125 via a file system protocol, such as CIFS or NTFS, or a block-based storage protocol, such as iSCSI or iFCP. Data backup and disaster recovery applications may also use these protocols or specific backup and recovery protocols, such as VTL or OST. For backup applications, a client system 105 a may include a backup agent 110 for initiating data backups. The backup agent 110 may communicate directly with the spanning storage interface 125 or a backup server 105 b, which in spanning storage interface 100 is equivalent to a client. For cloud storage applications, a client 105 c may communicate with the spanning storage interface 125 via a web services protocol, such as SOAP or REST. The web services protocol may present a virtualized storage device to client 105 c. The web services protocol used by clients 105 to communicate with the spanning storage interface 125 may be the same as or different from the protocol used by the spanning storage interface 125 to communicate with the cloud storage 175.

Embodiments of the spanning storage interface 100 may optimize data access to cloud storage 175 in a number of different ways. An embodiment of the spanning storage interface 125 may present clients 105 with a file system, backup device, storage array, or other data storage interface, while transparently storing and retrieving data using the cloud storage 175 via the wide-area network 177. In a further embodiment, the spanning storage interface 125 may perform data deduplication on data received from clients 105, thereby reducing the amount of storage capacity required in cloud storage 175. Additionally, because the bandwidth of the wide-area network is often limited, data deduplication by the spanning storage interface 125 increases the data access performance, as perceived by the clients 105. In still a further embodiment, the spanning storage interface 125 may locally cache a portion of the clients' data using local storage 170. The locally cached data may be accessed rapidly, further improving the perceived data access performance. As described in detail below, the spanning storage interface 125 may use a variety of different criteria for selecting the portion of the clients' data to cache locally and may locally cache data in a deduplicated form to reduce the required capacity of local storage 170.

An embodiment of spanning storage interface 125 includes one or more front end interfaces 130 for communicating with one or more client systems 105. Examples of front end interfaces 130 include a backup front end interface 130 a, a file system front end interface 130 b, a cloud storage front end interface 130 c, a file archival front end interface 130 d, and an object front end interface 130 e. An example backup front end interface 130 a enables backup applications, such as a backup agent 110 and/or a backup server 105 b, to store and retrieve data to and from the cloud storage 175 using data backup and recovery protocols such as VTL or OST. In this example, the backup front end interface 130 a allows the spanning storage interface 125 and cloud storage 175 to appear to clients 105 as a backup storage device.

An example file system front end interface 130 b enables clients 105 to store and retrieve data to and from the cloud data storage 175 using a file system protocol, such as CIFS or NTFS, or a block-based storage protocol, such as iSCSI or iFCP. In this example, the file system front end interface 130 b allows the spanning storage interface 125 and cloud storage 175 to appear to clients 105 as one or more storage devices, such as a CIFS or NTFS storage volume or an iSCSI or Fibre Channel logical unit number (LUN).

An example cloud storage front end interface 130 c enables clients 105 to store and retrieve data to and from the cloud data storage 175 using a cloud storage protocol or API. Typically, cloud storage protocols or APIs are implemented using a web services protocol, such as SOAP or REST. In this example, the cloud storage front end interface 130 c allows the spanning storage interface 125 and cloud storage 175 to appear to clients 105 as one or more cloud storage services. By using spanning storage interface 125 to provide a cloud storage interface to clients 105, rather than letting clients 105 communicate directly with the cloud storage 175, the spanning storage interface 125 may perform data deduplication, local caching, and/or translation between different cloud storage protocols.

An example file archival front end interface 130 d enables clients 105 to store and retrieve file archives. Clients 105 may use the spanning storage interface 125 and the cloud storage 175 to store and retrieve files or other data in one or more archive files. The file archival front end interface 130 d allows clients 105 to store archive files in cloud storage 175 using archive file interfaces, rather than a cloud storage interface. Additionally, the spanning storage interface 125 may perform data deduplication and local caching of the file archives.

An example object front end interface 130 e enables clients to store and retrieve data in any arbitrary format, such as object formats and blobs or binary large objects. The object front end interface 130 e allows clients 105 to store data in arbitrary formats, such as object formats or blobs, in cloud storage 175 using object protocols, such as object serialization or blob storage protocols, rather than a cloud storage protocol. Additionally, the spanning storage interface 125 may perform data deduplication and local caching of the object or blob data.

An example block storage protocol front end interface 130 f enables clients to store and retrieve data using block-based storage protocols, such as iSCSI. In an embodiment, the block storage protocol front end interface 130 f appears to clients 105 as one or more logical storage volumes, such as iSCSI LUNs.

In an embodiment, spanning storage interface 125 also includes one or more shell file systems 145. Shell file system 145 includes a representation of the entities, such as files, directories, objects, blobs, and file archives, stored by clients 105 via the front end interfaces 130. In an embodiment, the shell file system 145 includes entities stored by the clients 105 in a shell form. In this embodiment, each entity, such as a file or other entity, is represented by a “shell” entity that does not include the data contents of the original entity. For example, a shell file in the shell file system 145 includes the same name, file path, and file metadata as the original file. However, the shell file does not include the actual file data, which is stored in the cloud storage 175. It should be noted that although the size of the shell file is less than the size of the actual stored file (in either its original or deduplicated format), an embodiment of the shell file system 145 sets the file size metadata attribute of the shell file to the size of the original file. In a further embodiment, each entity in the shell file system 145, such as a file, directory, object, blob, or file archive, may include additional metadata for use by the spanning storage interface 125 to access the corresponding data from the cloud storage 175.
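
The shell entity described above can be pictured as a small record that keeps the original name, path, metadata, and reported size while omitting the data contents. The following is a minimal sketch under stated assumptions, not the implementation of shell file system 145: the names ShellFile, make_shell, and cloud_ref are hypothetical and do not appear in this description.

```python
from dataclasses import dataclass, field


@dataclass
class ShellFile:
    """Hypothetical shell entry: metadata only, no file contents."""
    path: str                      # same file path as the original file
    size: int                      # reported size is that of the ORIGINAL file
    metadata: dict = field(default_factory=dict)
    cloud_ref: str = ""            # assumed extra metadata used to locate the data


def make_shell(path, data, cloud_ref, metadata=None):
    # The contents in `data` are deliberately not kept in the shell entry;
    # only their length is recorded as the file size attribute.
    return ShellFile(path=path, size=len(data),
                     metadata=dict(metadata or {}), cloud_ref=cloud_ref)
```

A directory listing served from such records would report the original file size even though the entry itself occupies only a few bytes.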

In an embodiment, storage blocks provided to the spanning storage interface through the block storage protocol front end interface 130 f may bypass the shell file system 145. In this embodiment, data received by the spanning storage interface in the form of storage blocks are grouped together, for example in groups of fixed size and in order of receipt. Data deduplication is then applied to each group of storage blocks and the resulting deduplicated data is transferred to the cloud storage service. In this embodiment, the spanning storage interface 125 maintains a table or other data structure that associates storage block addresses or identifiers with corresponding deduplicated storage data, so that the spanning storage interface 125 can retrieve and reconstruct the appropriate data when a storage client requests access to a previously stored storage block.
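
The block grouping and address table described above might be sketched as follows. This is an illustration only: the class name BlockGroupTable, the constant GROUP_SIZE, and the use of a plain list as a stand-in for the deduplicated form of each group are all assumptions made for the example.

```python
GROUP_SIZE = 4  # hypothetical number of storage blocks per group


class BlockGroupTable:
    """Groups blocks in order of receipt and maps addresses to groups."""

    def __init__(self):
        self.pending = []      # (address, data) pairs not yet grouped
        self.groups = {}       # group id -> block data (stand-in for deduplicated data)
        self.block_index = {}  # block address -> (group id, position in group)
        self.next_group = 0

    def write_block(self, address, data):
        self.pending.append((address, data))
        if len(self.pending) == GROUP_SIZE:
            self.flush()

    def flush(self):
        """Close the current group and record where each block landed."""
        if not self.pending:
            return
        gid = self.next_group
        self.next_group += 1
        self.groups[gid] = [data for _, data in self.pending]
        for pos, (address, _) in enumerate(self.pending):
            self.block_index[address] = (gid, pos)  # a newer write wins
        self.pending = []

    def read_block(self, address):
        # Prefer the most recent ungrouped write, then consult the table.
        for pending_address, data in reversed(self.pending):
            if pending_address == address:
                return data
        gid, pos = self.block_index[address]
        return self.groups[gid][pos]
```

In a fuller version, flush would pass the group through data deduplication and hand the result to the replication path instead of keeping the raw blocks in memory.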

An embodiment of the spanning storage interface 125 includes a deduplication module 150 for deduplicating data received from clients 105. Deduplication module 150 analyzes data from clients 105 and compares incoming data with previously stored data to eliminate redundant data for the purposes of storage or communication. Data deduplication reduces the amount of storage capacity used by cloud storage 175 to store clients' data. Also, because wide-area network 177 typically has bandwidth limitations, the reduction of data size due to data deduplication also reduces the amount of time required to transfer data between clients 105 and the cloud storage 175. Additionally, deduplication module 150 retrieves deduplicated data from the cloud storage 175 and converts it back to its original form for use by clients 105.

In an embodiment, deduplication module 150 performs data deduplication on incoming data and temporarily stores this deduplicated data locally, such as on local storage 170. Local storage 170 may be a physical storage device connected with or integrated within the spanning storage interface 125. Local storage 170 is accessed from spanning storage interface 125 by a local storage interface 160, such as an internal or external data storage interface, or via a local-area network.

In an embodiment, the cloud storage 175 includes a complete and authoritative version of the clients' data. In a further embodiment, the spanning storage interface 125 may maintain local copies of some or all of the clients' data for the purpose of caching. In this embodiment, the spanning storage interface 125 uses the local storage 170 to cache client data. The spanning storage interface 125 may cache data in its deduplicated format to reduce local storage requirements or increase the effective cache size. In this embodiment, the spanning storage interface 125 may use a variety of criteria for selecting portions of the deduplicated client data for caching. For example, if the spanning storage interface 125 is used for general file storage or as a cloud storage interface, the spanning storage interface may select a specific amount or percentage of the client data for local caching. In another example, the data selected for local caching may be based on usage patterns of client data, such as frequently or recently used data. Caching criteria may be based on elapsed time and/or the type of data. In another example, the spanning storage interface 125 may maintain locally cached copies of the most recent data backups from clients, such as the most recent full backup and the previous week's incremental backups.
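
One of the caching criteria mentioned above, recently used data, can be illustrated with a least-recently-used cache of deduplicated segments that falls back to the cloud copy on a miss. This is a sketch of only that one criterion; the class name SegmentCache, the byte-based capacity, and the fetch_from_cloud callback are assumptions for the example.

```python
from collections import OrderedDict


class SegmentCache:
    """Least-recently-used cache of deduplicated segments, keyed by label."""

    def __init__(self, capacity_bytes, fetch_from_cloud):
        self.capacity = capacity_bytes
        self.fetch = fetch_from_cloud  # assumed callable: label -> segment bytes
        self.entries = OrderedDict()   # label -> data, oldest entry first
        self.used = 0

    def get(self, label):
        if label in self.entries:
            self.entries.move_to_end(label)  # mark as most recently used
            return self.entries[label]
        data = self.fetch(label)             # miss: the cloud copy is authoritative
        self.put(label, data)
        return data

    def put(self, label, data):
        if label in self.entries:
            self.used -= len(self.entries.pop(label))
        self.entries[label] = data
        self.used += len(data)
        # Evict least recently used segments until the cache fits again.
        while self.used > self.capacity and len(self.entries) > 1:
            _, evicted = self.entries.popitem(last=False)
            self.used -= len(evicted)
```

Because the cloud storage holds the complete copy, eviction here simply discards the local copy without writing anything back.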

In an embodiment, replication module 155 transfers locally stored deduplicated data from the spanning storage interface 125 to the cloud storage 175. Embodiments of the deduplication module 150 and the replication module 155 may operate in parallel and/or asynchronously, so that the bandwidth limitations of wide-area network 177 do not interfere with the throughput of the deduplication module 150. The operation of embodiments of deduplication module 150 and replication module 155 is described in detail below.
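
The asynchronous relationship between deduplication and replication can be sketched with a queue and a background thread, so that the producer never waits on an upload. This is a simplified illustration: the function name run_pipeline, the upload callback, and the use of pre-labeled segments are assumptions, and a real system would persist the pending data in local storage rather than in memory.

```python
import queue
import threading


def run_pipeline(segments, upload):
    """Deduplicate (label, data) pairs while a worker uploads new segments."""
    pending = queue.Queue()

    def replicate():
        # Stand-in for the replication side: drain the queue at its own pace.
        while True:
            item = pending.get()
            if item is None:      # sentinel: no more segments will arrive
                break
            label, data = item
            upload(label, data)   # assumed callable that sends data to the cloud

    worker = threading.Thread(target=replicate)
    worker.start()

    seen = set()
    for label, data in segments:
        if label not in seen:     # only previously unseen segments are queued
            seen.add(label)
            pending.put((label, data))

    pending.put(None)
    worker.join()
    return len(seen)
```

The loop over incoming segments only enqueues work, so a slow wide-area link delays the uploads but not the deduplication itself.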

An embodiment of spanning storage interface 125 includes a cloud storage backend interface 165 for communicating data between the spanning storage interface 125 and the cloud storage 175. Embodiments of the cloud storage backend interface 165 may use cloud storage protocols or APIs and/or web services protocols, such as SOAP or REST, to store and retrieve data from the cloud storage 175. In an embodiment, the replication module 155 transfers deduplicated data from local storage 170 to cloud storage 175 using the cloud storage backend interface 165. In an embodiment, the deduplication module 150 retrieves deduplicated data from the cloud storage 175 using the cloud storage backend interface 165.

An embodiment of the spanning storage interface 125 may be configured to operate with multiple cloud storage services. In an embodiment, the spanning storage interface 125 may transfer all or portions of the deduplicated data to two or more cloud storage services.

In another embodiment, the spanning storage interface 125 may transfer different portions of the deduplicated data to different cloud storage services, such as transferring a first portion of the deduplicated storage data to a first cloud storage service, a second portion of the deduplicated storage data to a second cloud storage service, and so forth.

Different cloud storage services may have different advantages and/or disadvantages, such as cost, bandwidth, reliability, and replication policies. In this embodiment, a system administrator or other user may identify the different portions of data and designate the cloud storage service to be used to store deduplicated versions of these portions of the data, thereby tailoring the usage of different cloud storage services to data storage needs. The user may identify different portions of data and associated cloud storage services based on file or object name, file or object type, file directory or path, contents of the data, and/or any other criteria or attribute of the data, storage client, cloud storage service, or the spanning storage interface 125.
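
User-designated routing of data to cloud storage services by path or file type could look like the following sketch. The rule list, the service names, and the function select_service are hypothetical; only matching on file path and name pattern is shown, not the other criteria listed above.

```python
from fnmatch import fnmatchcase

# Hypothetical administrator-supplied rules, checked in order.
RULES = [
    ("/finance/*", "cloud_a"),  # everything under one directory
    ("*.log", "cloud_b"),       # a file type, wherever it lives
]
DEFAULT_SERVICE = "cloud_default"


def select_service(path, rules=RULES, default=DEFAULT_SERVICE):
    """Return the name of the cloud storage service designated for a path."""
    for pattern, service in rules:
        # In fnmatch patterns, "*" also matches "/" characters.
        if fnmatchcase(path, pattern):
            return service
    return default
```

Because the rules are checked in order, a log file under /finance/ goes to the first matching service, so rule order is itself part of the policy.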

In yet a further embodiment, system administrators or other users may specify quotas for cloud storage access based on the total amount of data received from storage clients or the amount of deduplicated data transferred to the one or more cloud storage services. In this embodiment, if a data transfer exceeds or is anticipated to exceed a specified quota, the spanning storage interface 125 may abandon the storage operation and return an error message or other notification to the storage client. Embodiments may allow users to specify quotas for each storage client, a group of two or more storage clients, all of the storage clients at a network location, or based on criteria or attributes associated with the cloud storage service, spanning storage interface, and/or data, such as file or object names, file or object types, file directories or paths, or contents of the data.
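
A per-client quota check of the kind described above can be sketched as follows. The names QuotaTracker, QuotaExceeded, and charge are hypothetical, and the sketch covers only the per-storage-client case, with clients that have no configured limit treated as unrestricted.

```python
class QuotaExceeded(Exception):
    """Raised when a storage operation would exceed a specified quota."""


class QuotaTracker:
    def __init__(self, limits):
        self.limits = dict(limits)  # client id -> allowed bytes
        self.used = {}              # client id -> bytes charged so far

    def charge(self, client, nbytes):
        """Record a transfer, or abandon it if the quota would be exceeded."""
        limit = self.limits.get(client)
        used = self.used.get(client, 0)
        if limit is not None and used + nbytes > limit:
            # The operation is rejected before any usage is recorded.
            raise QuotaExceeded(
                f"{client}: {used + nbytes} bytes exceeds quota of {limit}")
        self.used[client] = used + nbytes
```

Whether nbytes counts the data received from the client or the deduplicated data actually transferred is a policy choice; the tracker works the same way for either.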

In an embodiment, the spanning storage interface 125 performs data deduplication by segmenting an incoming data stream to aid data compression. For example, segmentation may be designed to produce many identical segments when the data stream includes redundant data. Multiple instances of redundant data may be represented by referencing a single copy of this data.

Additionally, a data stream may be segmented based on data types to aid data compression, such that different data types are in different segments. Different data compression techniques may then be applied to each segment. Data compression may also determine the length of data segments. For example, data compression may be applied to a data stream until a segment boundary is reached or the segment including the compressed data reaches a predetermined size, such as 4 KB. The size threshold for compressed data segments may be based on optimizing disk or data storage device access.

Regardless of the technique used to segment data in the data stream, the result is a segmented data stream having its data represented as segments. In some embodiments of the invention, data segmentation occurs in memory and the segmented data stream is not written back to data storage in this form. Each segment is associated with a label. Labels are smaller in size than the segments they represent. The segmented data stream is then replaced with deduplicated data in the form of a label map and segment storage. Label map includes a sequence of labels corresponding with the sequence of data segments identified in the segmented data stream. Segment storage includes copies of the segment labels and corresponding segment data. Using the label map and the data segment storage, a storage system can reconstruct the original data stream by matching in sequence each label in a label map with its corresponding segment data from the data segment storage. In an embodiment, the deduplication module 150 and/or one or more other modules of the spanning storage interface 125 reconstruct all or a portion of the original data stream in response to a data access request from a storage client.

Embodiments of the invention attempt (but do not always succeed) to assign a single label to each unique data segment. Because the segmentation of the data stream produces many identical segments when the data stream includes redundant data, these embodiments allow a single label and one copy of the corresponding segment data to represent many instances of this segment data at multiple locations in the data stream. For example, a label map may include multiple instances of a given label at different locations. Each instance of this label represents an instance of the corresponding segment data. Because the label is smaller than the corresponding segment data, representing redundant segment data using multiple instances of the same label results in a substantial size reduction of the data stream.
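
The label map and segment storage described above can be illustrated with a short sketch. Two simplifications are assumptions made for the example and are not taken from this description: segments are cut at a fixed size, and labels are derived from a truncated SHA-256 hash of the segment contents.

```python
import hashlib


def deduplicate(stream, segment_store, segment_size=4):
    """Replace a byte stream with a label map, filling segment_store."""
    label_map = []
    for start in range(0, len(stream), segment_size):
        segment = stream[start:start + segment_size]
        # Assumed labeling scheme: identical segments get identical labels.
        label = hashlib.sha256(segment).hexdigest()[:16]
        segment_store.setdefault(label, segment)  # keep one copy per label
        label_map.append(label)
    return label_map


def reconstruct(label_map, segment_store):
    """Rebuild the original stream by resolving each label in sequence."""
    return b"".join(segment_store[label] for label in label_map)
```

For the 16-byte stream b"abcdabcdwxyzabcd" with 4-byte segments, the label map has four entries but the segment storage holds only two segments, since three of the four segments are identical.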

FIGS. 2, 3A-3B, 4, and 5 illustrate the operation of the deduplication module 150 and the replication module 155 according to an embodiment of the invention. FIG. 2 illustrates example data structures 200 used by a spanning storage interface according to an embodiment of the invention. An embodiment of spanning storage interface 200 includes both memory 205, which has high performance but relatively low capacity, and disk storage 210, which has high capacity but relatively low performance.

Memory 205 includes a slab cache data structure 215. The slab cache 215 is adapted to store a set of labels 220 and a corresponding set of data segments 225. In typical applications, the sets of labels 220 and data segments 225 stored in the slab cache 215 represent only a small fraction of the total number of data segments and labels used to represent stored data. A complete set of the labels and data segments is stored in disk storage 210.

An embodiment of the slab cache 215 also includes segment metadata 230, which specifies characteristics of the data segments 225. In an embodiment, the segment metadata 230 includes the lengths of the data segments 225; hashes or other characterizations of the contents of the data segments 225; and/or anchor indicators, which indicate whether a particular data segment has been designated as a representative example of the contents of a data segment slab file, as discussed in detail below.

An embodiment of the slab cache 215 also includes data segment reference count values. The spanning storage interface 200 recognizes that some data segments are used in multiple places in one or more data streams. For at least some of the data segments, an embodiment of the spanning storage interface 200 maintains counts, referred to as reference counts, of the number of times these data segments are used. As discussed in detail below, if a data stream includes a data segment previously defined, an embodiment of the spanning storage interface 200 may increment the reference count value associated with this data segment. Conversely, if a data stream is deleted from the spanning storage interface 200, an embodiment of the spanning storage interface 200 may decrement the reference count values associated with the data segments included in the deleted data stream. If the reference count value of a data segment drops to zero, the data segment and label may be deleted and its storage space reallocated.
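
The reference counting behavior described above can be sketched as follows. The class name RefCountedStore and its methods are hypothetical, and the sketch keeps everything in memory rather than distinguishing the slab cache from disk storage.

```python
class RefCountedStore:
    """Segment storage that tracks how many times each segment is used."""

    def __init__(self):
        self.segments = {}   # label -> segment data
        self.refcounts = {}  # label -> number of uses across all streams

    def add_stream(self, labeled_segments):
        """Store a stream given as (label, data) pairs; return its label map."""
        label_map = []
        for label, data in labeled_segments:
            if label in self.segments:
                self.refcounts[label] += 1  # previously defined segment
            else:
                self.segments[label] = data
                self.refcounts[label] = 1
            label_map.append(label)
        return label_map

    def delete_stream(self, label_map):
        """Decrement counts for a deleted stream and reclaim unused segments."""
        for label in label_map:
            self.refcounts[label] -= 1
            if self.refcounts[label] == 0:
                # No stream uses this segment any more; free its space.
                del self.refcounts[label]
                del self.segments[label]
```

A segment shared by two streams therefore survives the deletion of either one and is reclaimed only when both are gone.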

In addition to the slab cache 215, an embodiment of the spanning storage interface 200 includes a reverse map cache 240. In an embodiment, the reverse map cache 240 maps the contents of a data segment to a label, for the labels stored in the slab cache 215. In an embodiment, a hashing or other data characterization technique is applied to segment data. The resulting value is used as an index in the reverse map cache 240 to identify an associated label in the slab cache 215. If the hash or other value derived from the segment data matches an entry in the reverse map cache 240, then this data segment has been previously defined and is stored in the slab cache 215. If the hash or other value derived from the segment data does not match any entry in the reverse map cache 240, then this data segment is not currently stored in the slab cache 215. Because the slab cache 215 only includes a portion of the total number of labels used to represent data segments, a data segment that does not match a reverse map cache entry may either have not been previously defined or may have been previously defined but not loaded into the slab cache 215.

In an embodiment, memory 205 of the spanning storage interface 200 also includes an anchor cache 245. Anchor cache 245 is similar to reverse map cache 240; however, anchor cache 245 matches the contents of data segments with representative data segments in data segment slab files stored on disk storage 210. A complete set of data segments is stored in one or more data segment slab files in disk storage 210. In an embodiment, one or more representative data segments from each data segment slab file are selected by the spanning storage interface 200. The spanning storage interface 200 determines hash or other data characterization values for these selected representative data segments and stores these values, along with data identifying the file or disk storage location including each such data segment, in the anchor cache 245. In an embodiment, the data identifying the file or disk storage location of a representative data segment may be its associated label. The spanning storage interface 200 uses the anchor cache 245 to determine if a data segment from a data stream matches a data segment from another data stream previously stored in disk storage but not currently stored in the slab cache.

In an embodiment, potential representative data segments are identified during segmentation of a data stream. As discussed in detail below, when one or more potential representative data segments are later stored in disk storage 210, for example in a data segment slab file, an embodiment of the spanning storage interface 200 selects one or more of these potential representative data segments for inclusion in the anchor cache.

A variety of criteria and types of analysis may be used alone or together in various combinations to identify representative data segments in data streams and/or in data segment slab files stored in disk storage 210. For example, the spanning storage interface 200 selects the first unique data segment in a data stream as a representative data segment. In another example, the spanning storage interface 200 uses the content of the data stream to identify potential representative data segments. In still another example, the spanning storage interface 200 uses criteria based on metadata such as a file type, data type, or other attributes provided with a data stream to identify potential representative data segments. For example, data segments including specific sequences of data and/or located at specific locations within a data stream of a given type may be designated as representative data segments based on criteria or heuristics used by the spanning storage interface 200. In a further example, a random selection of unique segments in a data stream or a data segment slab file may be designated as representative data segments. In yet a further example, representative data segments may be selected at specific locations of data segment slab files, such as the middle data segment in a slab file.
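As one illustration of a position-based criterion, the sketch below picks the first and the middle data segment of a slab file as representatives and builds anchor cache entries for them. The combination of these two positions, the use of SHA-256, and the names `pick_representative_indices`, `anchor_entries`, and "slab-0007" are assumptions made for the example.

```python
import hashlib


def pick_representative_indices(segment_count: int):
    """Return positions of the first and middle segments of a slab file."""
    if segment_count == 0:
        return []
    # A set removes the duplicate when the slab holds a single segment.
    return sorted({0, segment_count // 2})


def anchor_entries(slab_name: str, segments):
    """Map the hash of each representative segment to its slab file name."""
    entries = {}
    for index in pick_representative_indices(len(segments)):
        digest = hashlib.sha256(segments[index]).hexdigest()
        entries[digest] = slab_name
    return entries


example_segments = [b"s0", b"s1", b"s2", b"s3", b"s4"]
example_entries = anchor_entries("slab-0007", example_segments)
```

A match against one of these entries identifies which slab file to load, after which the remaining segments of that slab file become available for matching through the reverse map cache.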

Disk storage 210 stores a complete set of data segments and associated labels used to represent all of the data streams stored by spanning storage interface 200. In an embodiment, disk storage 210 may be comprised of multiple physical and/or logical storage devices. In a further embodiment, disk storage 210 may be implemented using a storage area network.

Disk storage 210 includes one or more data segment slab files 250. Each data segment slab file 250 includes a segment index 255 and a set of data segments 265. The segment index 255 specifies the location of each data segment within the data segment slab file. Data segment slab file 250 also includes segment metadata 260, similar to the segment metadata 230 discussed above. In an embodiment, segment metadata 260 in the data segment slab file 250 is a subset of the segment metadata in the slab cache 215 to improve compression performance. In this embodiment, the spanning storage interface 200 may recompute or recreate the remaining metadata attribute values for data segments upon transferring data segments into the slab cache 215.

Additionally, data segment slab file 250 may include data segment reference count values 270 for some or all of the data segments 265. In an embodiment, slab file 250 may include slab file metadata 275, such as a list of data segments to be deleted from the slab file 250.

Disk storage 210 includes one or more label map container files 280. Each label map container file 280 includes one or more label maps 290. Each of the label maps 290 corresponds with all or a portion of a deduplicated data stream stored by the spanning storage interface 200. Each of the label maps 290 includes a sequence of one or more labels corresponding with the sequence of data segments in all or a portion of a deduplicated data stream. In an embodiment, each label map also includes a label map table of contents providing the offset or relative position of sections of the label map sequence with respect to the original data stream. In one implementation, the label maps are compressed in sections, and the label map table of contents provides offsets or relative locations of sections of the label map sequence relative to the uncompressed data stream. The label map table of contents may be used to allow random or non-sequential access to a deduplicated data stream.
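The way a table of contents enables random access can be sketched as below. The layout is an assumption: here a section is a fixed number of labels, the table of contents is a sorted list of the stream offsets at which each section begins, and segment lengths are available alongside the labels. The names `build_toc` and `find_label_index` are hypothetical.

```python
from bisect import bisect_right


def build_toc(segment_lengths, section_size):
    """Return the stream offset at which each section of labels starts."""
    toc = []
    offset = 0
    for i, length in enumerate(segment_lengths):
        if i % section_size == 0:
            toc.append(offset)
        offset += length
    return toc


def find_label_index(segment_lengths, toc, section_size, stream_offset):
    """Locate the label covering stream_offset.

    Returns (index of the label in the label map, offset inside its segment).
    """
    if not 0 <= stream_offset < sum(segment_lengths):
        raise ValueError("offset outside the data stream")
    # Binary search picks the section; only that section is then scanned.
    section = bisect_right(toc, stream_offset) - 1
    index = section * section_size
    offset = toc[section]
    while offset + segment_lengths[index] <= stream_offset:
        offset += segment_lengths[index]
        index += 1
    return index, stream_offset - offset


lengths = [10, 20, 5, 15, 30]          # segment lengths, in label map order
toc = build_toc(lengths, section_size=2)
position = find_label_index(lengths, toc, 2, 37)
```

With sections compressed independently, only the section found by the binary search would need to be decompressed to serve the request.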

Additionally, label map container file 280 may include label map container index 285 that specifies the location of each label map within the label map container file.

In an embodiment, label names are used not only to identify data segments, but also to locate data segments and their containing data segment slab files. For example, labels may be assigned to data segments during segmentation. Each label name may include a prefix portion and a suffix portion. The prefix portion of the label name may correspond with the file system path and/or file name of the data segment slab file used to store its associated segment. All of the data segments associated with the same label prefix may be stored in the same data segment slab file. The suffix portion of the label name may be used to specify the location of the data segment within its data segment slab file. The suffix portion of the label name may be used directly as an index or location value of its data segment or indirectly in conjunction with segment index data in the slab file. In this implementation, the complete label name associated with a data segment does not need to be stored in the slab file. Instead, the label name is represented implicitly by the storage location of the slab file and the data segment within the slab file. In a further embodiment, label names are assigned sequentially in one or more namespaces or sequences to facilitate this usage.
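A label naming scheme of this kind might be sketched as follows. The concrete format (a zero-padded slab number, a hyphen, and a zero-padded segment index) and the "slabs/" directory and ".slab" extension are hypothetical choices for illustration only.

```python
def make_label(slab_number: int, segment_index: int) -> str:
    """Build a label whose prefix names the slab file and whose suffix
    gives the segment's position within that file."""
    return f"{slab_number:04d}-{segment_index:04d}"


def locate(label: str):
    """Derive (slab file path, segment index) from the label name alone."""
    prefix, _, suffix = label.rpartition("-")
    return f"slabs/{prefix}.slab", int(suffix)


label = make_label(7, 42)
location = locate(label)
```

Because the location is recoverable from the name, the slab file itself does not need to store the label next to each segment.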

An embodiment similarly uses data stream identifiers not only to identify deduplicated data streams but also to locate label maps and their containing label map containers. For example, a data stream identifier is assigned to a data stream during deduplication. Each data stream identifier may include a prefix portion and a suffix portion. The prefix portion of the data stream identifier may correspond with the file system path and/or file name of the label map container used to store the label map representing the data stream. The suffix portion of the data stream identifier may be used to directly or indirectly specify the location of the label map within its label map container file. In a further embodiment, data stream identifiers are assigned sequentially in one or more namespaces or sequences to facilitate this usage.

Embodiments of the spanning storage interface 200 may specify the sizes, location, alignment, and optionally padding of data in data segment slab files 250 and label map container files 280 to optimize the performance of disk storage 210. For example, segment reference counts are frequently updated, so these may be located at the end of the data segment slab file 250 to improve update performance. In another example, data segments may be sized and aligned according to the sizes and boundaries of clusters or blocks in the disk storage 210 to improve access performance and reduce wasted storage space.

FIG. 3A illustrates a method 300 of converting a data stream into deduplicated data according to an embodiment of the invention. An embodiment of method 300 may be executed at least in part by a deduplication module included in a spanning storage interface.

Step 305 receives all or a portion of a data stream. The data stream may be any type or format of data, including files and objects. In an embodiment, a deduplicating storage interface client provides the data stream to the spanning storage interface.

Step 310 uses a segmentation technique to generate one or more data segments from the data stream or portion thereof received by step 305.

Step 315 determines if any of the generated data segments are referenced by the anchor cache of the spanning storage interface. In an embodiment, step 315 compares a hash or other characterization of the contents of each of the data segments with entries of the anchor cache. If the hash of the data segment matches an entry of the anchor cache, then the data segment is referenced by the anchor cache. In a further embodiment, if the hash of a data segment matches an entry of the anchor cache, step 315 then compares the segment length and/or the contents of the data segment with the corresponding data segment stored in a slab file to verify that the data segment from the data stream and the previously generated instance of the data segment are identical.

In an embodiment, a copy of only a portion of the data segments used for data deduplication is stored locally. The full and authoritative set of data segments is stored in one or more slab files stored in the cloud storage. Because the cloud storage is accessed via a wide-area network, there are often substantial bandwidth and latency restrictions on accessing slab files from cloud storage. In an embodiment, if a data segment from the data stream matches an entry from the anchor cache, step 315 selects the slab file associated with this anchor cache entry for processing by method 350, as discussed below. In an embodiment, method 350 may retrieve one or more slab files selected by step 315 from the cloud storage in parallel and/or asynchronously with the execution of method 300.

Step 325 determines if any of the data segments generated in step 310 match a data segment referenced by the reverse map in memory. In an embodiment, step 325 is similar to step 315. Step 325 compares a hash or other characterization of the contents of the data segment with entries of the reverse map. In a further embodiment, if the hash of the data segment matches an entry of the reverse map (and/or previously matched an entry of the anchor cache), step 325 also compares the segment length and/or the contents of the data segment with the corresponding data segment stored in the slab cache to verify that the data segment from the data stream and the cached data segment are identical.

For each of the data segments from the data stream that match previously generated data segments in the slab cache, step 325 associates these data segments from the data stream with the labels assigned to their counterparts in the slab cache. Step 330 increments the reference counts for these labels based on the number of instances of their associated data segment in the data stream. For example, step 330 increments the reference count by one for each instance of the generated data segment in the data stream.

Conversely, if one or more of the data segments from the data stream are not referenced by the reverse map, then step 335 assigns new labels to these newly generated data segments. These new labels assigned by step 335 are referred to as provisional labels. As discussed below, method 350 may replace provisional labels assigned by step 335 with previously generated labels corresponding with identical data segments in slab files retrieved from the cloud storage. Step 335 then adds the new data segments and their assigned provisional labels to the slab cache in memory. For each newly added data segment and provisional label, step 335 generates segment metadata and adds it to the slab cache. Step 335 also initializes a reference count in the slab cache for each of the newly added data segments, setting each newly added provisional label's reference count to correspond with the number of currently known instances of the corresponding data segment in the data stream. For example, step 335 may initialize a reference count associated with a new provisional label and data segment to one, if the data segment occurs only once in the data stream or portion thereof received by step 305. In another example, step 335 may initialize the reference count associated with a new provisional label and data segment to a number greater than one if this data segment is used multiple times in the received portion of the data stream. Step 335 also adds the new provisional labels and hashes or other data characterizations of the new data segments to the reverse map in memory.

Following steps 330 or 335, the slab cache in memory has been updated with all of the data segments generated by step 310 from the received portion of the data stream, either by incrementing the reference counts of previously generated labels or adding new provisional labels and associated data segments to the slab cache. In a further embodiment, the updates to the slab cache in memory are stored in local disk storage for further processing and eventual copying to the cloud storage. In an embodiment, method 300 stores a copy of any new data segments and associated metadata in local disk storage in one or more new slab files. Additionally, any changes to previously-generated data segment metadata, such as updates in reference counts, may be stored in local storage as well.

Step 340 adds the sequence of labels associated with the data segments generated by step 310 to a label map. The sequence of labels may include previously generated labels and/or provisional labels, depending upon the contents of the current data stream and any previously processed data streams. Step 340 adds labels to the label map in the same sequence as their corresponding data segments are found in the data stream.

Decision block 345 determines if all of the data in the data stream has been processed by steps 310 to 340. If not all of the data in the data stream has been processed, method 300 returns to step 305 to receive another portion of the data stream and to generate and process additional data segments.

If all of the data stream has been processed, method 300 proceeds to step 350. Step 350 adds the completed label map to a label map container file in the local disk storage. Step 350 assigns the data stream and its corresponding label map a data stream identifier. In an embodiment, the data stream identifier specifies the identity and/or the location of the label map container file in the disk storage. Step 350 may store the data stream identifier in the metadata of the corresponding file in the shell file system, such as in a reparse point in an NTFS file system or an extended attribute in an ext3 file system. Following step 350, the spanning storage interface 125 may delete the original data stream from memory or disk storage, as this data stream is now stored in deduplicated form by the spanning storage interface.
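The core loop of method 300 (steps 310 and 325 to 345) can be sketched as follows. Several simplifications are assumed: segmentation uses fixed-size segments, SHA-256 stands in for the unspecified characterization technique, the anchor cache check of step 315 is omitted, and nothing is written to slab files or label map container files. The class name `Deduplicator`, the constant `SEGMENT_SIZE`, and the "P0", "P1" label format are hypothetical.

```python
import hashlib

SEGMENT_SIZE = 4  # assumed fixed segment size, chosen only for the example


class Deduplicator:
    def __init__(self):
        self.slab_cache = {}      # label -> segment bytes
        self.refcounts = {}       # label -> reference count
        self.reverse_map = {}     # hash of contents -> label
        self.provisional = set()  # labels not yet reconciled with cloud storage
        self.next_label = 0

    def ingest(self, stream: bytes):
        """Deduplicate a stream and return its label map (a list of labels)."""
        label_map = []
        for start in range(0, len(stream), SEGMENT_SIZE):   # step 310
            segment = stream[start:start + SEGMENT_SIZE]
            key = hashlib.sha256(segment).digest()
            label = self.reverse_map.get(key)                # step 325
            if label is not None and self.slab_cache[label] == segment:
                self.refcounts[label] += 1                   # step 330
            else:
                label = f"P{self.next_label}"                # step 335
                self.next_label += 1
                self.slab_cache[label] = segment
                self.refcounts[label] = 1
                self.reverse_map[key] = label
                self.provisional.add(label)
            label_map.append(label)                          # step 340
        return label_map


dedup = Deduplicator()
label_map = dedup.ingest(b"aaaabbbbaaaa")
```

In this run the stream's three segments are represented by two stored segments, with the repeated segment carrying a reference count of two.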

FIG. 3B illustrates a method 350 for transferring deduplicated data from a spanning storage interface to cloud storage. An embodiment of method 350 may be executed by a replication module operating in parallel and/or asynchronously with a deduplication module. As described above, an embodiment of the spanning storage interface includes a local copy of only a portion of the data segments used for data deduplication. The full and authoritative set of data segments is stored in one or more slab files stored in the cloud storage. Thus, this embodiment of the spanning storage interface should copy any newly added data segments or updated segment metadata to the cloud storage as soon as possible, so that the cloud storage includes a complete and authoritative set of the data segments, associated labels, and label metadata, such as reference counts.

In an embodiment, a complete set of slab files, including at least all of the data segments used to store a deduplicated version of the client's data, is stored in cloud storage. If step 315 in method 300 matches a data segment to an entry of the anchor cache, then the data of this segment has been previously associated with a label. To optimize the data deduplication, this previously associated label should be associated with the new data segment. Additionally, because the anchor cache only includes a representative sample of data segments in the slab file, it is likely that other data segments in the slab file associated with the matching anchor cache entry may also match other recently received data segments. Thus, step 355 retrieves one or more slab files previously selected for retrieval by step 315 in method 300.

In an embodiment, step 355 retrieves one or more previously selected slab files from cloud storage via the wide-area network. In an embodiment, step 355 uses the label name of the matching anchor cache entry to identify and optionally locate the data segment slab file including the previously generated instance of the data segment. In a further embodiment, copies of some of the slab files may be stored locally. In this embodiment, step 355 determines if any of the selected slab files have local copies. Step 355 then retrieves any selected slab files that do not have copies stored locally from the cloud storage.

Step 360 processes the selected and retrieved slab files. In an embodiment, step 360 retrieves all of the data segments included in each such data segment slab file from disk storage and adds them to the slab cache in memory. Step 360 also retrieves and/or regenerates the labels and segment metadata for these data segments and adds these to the slab cache. Step 360 retrieves the segment reference counts for these data segments from the data segment slab file and adds these to the slab cache in memory. Step 360 also updates the reverse map cache with the labels and hashes or other data characterizations of the retrieved data segments.

In method 300, data segments that do not match reverse map cache entries are assigned provisional labels. Data segments assigned provisional labels may include data segments matching an anchor cache entry as well as data segments that do not match any anchor cache entries. Step 365 identifies the provisional labels, if any, in one or more newly created label maps and/or label map container files.

Step 370 compares the data segments associated with the provisional labels with the updated reverse map cache. Step 370 ignores the reverse map cache entries associated with provisional labels in this comparison; instead, step 370 determines if any provisionally labeled data segments are identical to previously generated data segments. In an embodiment, step 370 compares a hash or other characterization of the contents of these provisionally labeled data segments with the non-provisional entries of the reverse map cache, which are cache entries that are not associated with provisional labels. In a further embodiment, if the hash of the data segment matches an entry of the reverse map, step 370 also compares the segment lengths and/or the contents of these provisionally labeled data segments with the corresponding non-provisional data segments stored in the slab cache to verify that the data segment from the data stream and the cached data segment are identical.

For data segments that do not match cached data segments in the slab cache, an embodiment of step 375 may change their associated labels to non-provisional status. An embodiment of step 375 may update the label map, label map container file, slab file, slab cache and/or reverse map cache with this change in status.

For data segments that do match cached data segments in the slab cache, an embodiment of step 380 replaces the associated provisional labels in label maps with the matching non-provisional labels. As a result of step 380, a provisional label referencing a recently created data segment is replaced with a non-provisional label referencing a previously generated segment. However, no data is lost by step 380, because the contents of the provisional data segment are identical to the previously generated non-provisional data segment, as determined by step 370.

Step 385 removes and discards the provisionally labeled data segments that match previously generated, non-provisionally labeled data segments. In an embodiment, step 385 removes these provisional data segments from a slab file stored locally by a spanning storage interface. In a further embodiment, step 385 removes the provisional data segment and its associated provisional label from the slab cache and reverse map, respectively. These provisional labels and data segments may be removed because they are duplicative of previously generated data segments and labels. In an embodiment, step 385 updates the previously generated non-provisional label and data segment metadata. For example, if a provisional label is associated with a reference count, which indicates how many times this provisional label is used in one or more label maps, then step 385 may add this reference count to the reference count of the matching previously-generated non-provisional label. As a result, the reference count of this non-provisional label will be equal to the total number of instances of this segment data, regardless of whether these instances were previously associated with the provisional label or the non-provisional label.
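Steps 370 to 385 can be sketched as a single reconciliation pass. This is a simplified illustration: segments are matched by their full contents rather than by hash followed by verification, all data lives in plain dictionaries, and the function name `reconcile`, its parameters, and the labels "P0", "P1", and "S7-3" are hypothetical.

```python
def reconcile(label_maps, segments, refcounts, provisional, canonical_by_content):
    """Replace provisional labels that duplicate previously generated segments.

    label_maps           -- list of label lists, updated in place
    segments             -- label -> segment bytes
    refcounts            -- label -> reference count
    provisional          -- set of provisional labels
    canonical_by_content -- segment bytes -> non-provisional label, built
                            from slab files retrieved from cloud storage
    """
    replacements = {}
    for label in sorted(provisional):
        match = canonical_by_content.get(segments[label])    # step 370
        if match is None:
            continue  # step 375: no duplicate, label simply becomes permanent
        replacements[label] = match
        # Step 385: fold the provisional count into the existing label and
        # discard the duplicate segment.
        refcounts[match] = refcounts.get(match, 0) + refcounts.pop(label)
        del segments[label]
    for label_map in label_maps:                              # step 380
        label_map[:] = [replacements.get(l, l) for l in label_map]
    provisional.clear()  # every remaining label is now non-provisional
    return replacements


segments = {"P0": b"aaaa", "P1": b"bbbb", "S7-3": b"aaaa"}
refcounts = {"P0": 2, "P1": 1, "S7-3": 5}
provisional = {"P0", "P1"}
label_maps = [["P0", "P1", "P0"]]
replaced = reconcile(label_maps, segments, refcounts, provisional,
                     {b"aaaa": "S7-3"})
```

After the pass, the label map refers to the previously generated label wherever the contents were identical, and the reference counts still sum to the same total number of uses.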

Step 390 identifies changes in the locally stored label map container files and slab files in comparison with their counterparts (if any) stored in the cloud storage. The changes identified by step 390 may include new label map container files and new slab files, as well as modified versions of label map container files and slab files previously stored in cloud storage. Step 395 transfers the new and changed label map container files and slab files to the cloud storage. In an embodiment, step 395 only communicates the changed or new data to the cloud storage.

Following step 395, the cloud storage includes a complete and authoritative version of the label maps and data segments. Thus, the slab files and label map container files stored in the cloud storage may be used to reconstruct any or all of the data previously stored by the clients via the spanning storage interface. In a further embodiment, step 395 may use atomic operations to update or add label map container and slab files in the cloud storage. In this embodiment, new and changed data is first uploaded to the cloud storage and then committed. If the transfer of data is interrupted before the commitment, for example due to a system or network failure, the previous versions of the label map container and slab files stored in the cloud storage will not be corrupted and may be used to restore client data at the same or a different location. This allows the spanning storage interface to use cloud storage as a deduplicated disaster data recovery facility.
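The upload-then-commit pattern can be illustrated with a local file system analogy. The sketch below is not a cloud storage API; it assumes a local directory standing in for the remote store and uses a temporary file followed by an atomic rename as the commit. The function name `atomic_write` is hypothetical.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to a temporary file, then commit it with an atomic rename.

    If the write is interrupted before the rename, the previous file at
    `path` is left untouched.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(data)          # "upload" phase
            tmp_file.flush()
            os.fsync(tmp_file.fileno())
        os.replace(tmp_path, path)        # "commit" phase
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as store:
    slab_path = os.path.join(store, "0007.slab")
    atomic_write(slab_path, b"version 1")
    atomic_write(slab_path, b"version 2")
    with open(slab_path, "rb") as slab_file:
        committed = slab_file.read()
    leftover_files = os.listdir(store)
```

How an actual cloud storage service exposes an equivalent commit operation depends on that service and is not specified here.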

Following step 395, the spanning storage interface may delete some or all of the local copies of slab files and label map container files. In a further embodiment, the spanning storage interface may maintain local copies of some or all of the slab files and label map container files for the purpose of caching. The local caching may use the local storage associated with the spanning storage interface. The spanning storage interface may cache data in its deduplicated format to reduce local storage requirements or increase the effective cache size. In this embodiment, the spanning storage interface may use a variety of criteria for selecting portions of the deduplicated client data for caching. For example, if the spanning storage interface is used for general file storage or as a cloud storage interface, the spanning storage interface may select a specific amount or percentage of the client data for local caching. In another example, the data selected for local caching may be based on usage patterns of client data, such as frequently or recently used data. Caching criteria may be based on elapsed time and/or the type of data. In another example, the spanning storage interface may maintain locally cached copies of the most recent data backups from clients, such as the most recent full backup and the previous week's incremental backups.

FIG. 4 illustrates a method 400 of retrieving an original data stream from deduplicated data according to an embodiment of the invention. In an embodiment, step 405 receives a data access request from a client.

Step 410 identifies a label map associated with the requested data. For example, if the data access request is for a file in the shell file system, an embodiment of step 410 retrieves a data stream identifier from the metadata of this shell file. Step 410 then retrieves the label map associated with the data stream identifier from memory, disk storage, or cloud storage. The label map includes a sequence of labels corresponding with a sequence of data segments representing the data stream. In an embodiment, the data stream identifier specifies the identity and/or the location of the label map container file in the disk or cloud storage. For example, a prefix portion of the data stream identifier may correspond with the file system path and/or file name or cloud data identifier of the label map container file used to store the label map representing the data stream. A suffix portion of the data stream identifier may be used to directly or indirectly specify the location of the label map within its label map container file.

Upon retrieving the label map associated with the data stream identifier, step 415 selects the next label in sequence in the label map. In an embodiment, method 400 may receive the data stream identifier with a request for the entire data stream. In this embodiment, the first iteration of step 415 selects the first label in the label map.

In another embodiment, method 400 may receive a data stream identifier with a request for only a portion of the data stream. In this embodiment, step 415 selects the first label corresponding with the beginning of the requested portion of the data stream. In an embodiment, each label map includes a label map table of contents providing the offset or relative position of each instance of a label with respect to the original data stream. The label map table of contents may be used to allow random or non-sequential access to a deduplicated data stream. In an embodiment, the requested portion of the data stream is specified with a starting data stream address or offset and/or an ending data stream offset or address. Step 415 uses this label map table of contents to identify the label corresponding with the starting data stream address or offset.

Decision block 420 determines if the data segment corresponding with the selected label is already stored in the slab cache in memory. In an embodiment, decision block 420 searches for the selected label in the slab cache to make this determination. If the data segment corresponding with the selected label is already stored in the slab cache in memory, then method 400 proceeds to step 430.

Conversely, if the data segment corresponding with the selected label is not stored in the slab cache in memory, step 425 accesses a slab data file including a previously generated instance of the data segment corresponding with the selected label. In an embodiment, step 425 uses the label name to identify and optionally locate the data segment slab file including the previously generated instance of the data segment. Step 425 may retrieve the slab file from cloud storage. In a further embodiment, step 425 first checks to see if the required slab file is cached locally by the spanning storage interface; if so, then step 425 retrieves the data segment from the local copy of the slab file, rather than from the cloud storage.

Step 425 retrieves at least the data segment corresponding with the selected label from its data segment slab file and adds it to the slab cache in memory. In an embodiment, step 425 retrieves all of the data segments included in this data segment slab file from local storage or cloud storage and adds them to the slab cache in memory. Step 425 also retrieves and/or generates the labels and segment metadata for the retrieved data segments and adds these to the slab cache. Step 425 retrieves the segment reference counts for these data segments from the data segment slab file and adds these to the slab cache in memory. Step 425 also updates the reverse map cache with the labels and hashes or other data characterizations of the retrieved data segments.

Step 430 retrieves the data segment corresponding with the selected label from the slab cache. Step 435 adds all or a portion of this data segment to a data stream buffer or other data structure used to reconstruct the requested data stream. In an embodiment, steps 430 and 435 decompress the contents of the data segment prior to adding it to the data stream buffer.

In another embodiment, data segments are decompressed upon being initially added to the slab cache. In still another embodiment, one or more data segments are decompressed after being added to the data stream buffer.

In an embodiment, method 400 may receive a request for only a portion of the data stream. In this embodiment, step 435 may need to remove the beginning of a data segment if the data segment is the first data segment in the requested portion of the data stream, such that the beginning of the data stream buffer matches the beginning of the requested portion of the data stream. Similarly, step 435 may need to remove the end of a data segment if the data segment is the last data segment in the requested portion of the data stream, such that the end of the data stream buffer matches the end of the requested portion of the data stream.

Decision block 440 determines if all of the labels corresponding with the requested data in the data stream have been processed by steps 410 to 435. If not all of the labels corresponding with the requested data in the data stream have been processed, method 400 returns to step 415 to process additional labels from the label map associated with the data stream.

Once all of the labels associated with the requested portion of the data stream have been processed, method 400 proceeds to step 445. Step 445 returns the data stream to the deduplicating storage interface client or other entity requesting the data stream. Embodiments of method 400 may output the data stream in its entirety in step 445 or output portions of the requested portion of the data stream in step 445 in parallel with performing the other steps of method 400 to reconstruct other portions of the requested portion of the data stream. For example, step 425 may be performed asynchronously with other steps of method 400 so that slab files may be retrieved from the cloud storage in the background while the spanning storage interface processes other labels in the label map.
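The reconstruction loop of method 400, including the trimming of the first and last data segments for a partial request, can be sketched as below. The sketch assumes that every needed segment is already in an in-memory dictionary (so step 425 is omitted), that segments are uncompressed, and that it walks the label map from the start rather than using a table of contents. The function name `read_range` and the labels "A" and "B" are hypothetical.

```python
def read_range(label_map, segments, start, end):
    """Return the bytes of the original stream in the range [start, end)."""
    buffer = bytearray()
    offset = 0  # stream offset at which the current segment begins
    for label in label_map:                       # step 415
        data = segments[label]                    # step 430
        seg_start, seg_end = offset, offset + len(data)
        offset = seg_end
        if seg_end <= start:
            continue                              # entirely before the request
        if seg_start >= end:
            break                                 # entirely after the request
        # Step 435: drop the leading part of the first segment and the
        # trailing part of the last segment so the buffer matches the request.
        lo = max(start - seg_start, 0)
        hi = min(end, seg_end) - seg_start
        buffer += data[lo:hi]
    return bytes(buffer)


segments = {"A": b"hello ", "B": b"world"}
label_map = ["A", "B", "A"]
full_stream = read_range(label_map, segments, 0, 17)
partial = read_range(label_map, segments, 3, 8)
```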

FIG. 5 illustrates a method 500 of deleting a data stream from a spanning storage interface according to an embodiment of the invention. In an embodiment, step 505 receives a data stream identifier from a deduplicating storage interface client.

Step 510 retrieves the label map associated with the data stream identifier from memory or disk storage. The label map includes a sequence of labels corresponding with a sequence of data segments representing the data stream. In an embodiment, the data stream identifier specifies the identity and/or the location of the label map container file in the disk storage. For example, a prefix portion of the data stream identifier may correspond with the file system path and/or file name of the label map container used to store the label map representing the data stream. A suffix portion of the data stream identifier may be used to directly or indirectly specify the location of the label map within its label map container file.

Upon retrieving the label map associated with the data stream identifier, step 515 selects the next label in sequence in the label map. In an embodiment, the first iteration of step 515 selects the first label in the label map.

Decision block 520 determines if the data segment corresponding with the selected label is already stored in the slab cache in memory. In an embodiment, decision block 520 searches for the selected label in the slab cache to make this determination. If the data segment corresponding with the selected label is already stored in the slab cache in memory, then method 500 proceeds to step 530.

Conversely, if the data segment corresponding with the selected label is not stored in the slab cache in memory, step 525 accesses a slab data file including a previously generated instance of the data segment corresponding with the selected label. In an embodiment, step 525 uses the label name to identify and optionally locate the data segment slab file including the previously generated instance of the data segment.

Step 525 retrieves at least the data segment corresponding with the selected label from its data segment slab file and adds it to the slab cache in memory. In an embodiment, step 525 retrieves all of the data segments included in this data segment slab file from disk storage or cloud storage and adds them to the slab cache in memory. Step 525 also retrieves and/or generates the labels and segment metadata for the retrieved data segments and adds these to the slab cache. Step 525 retrieves the segment reference counts for these data segments from the data segment slab file and adds these to the slab cache in memory. Step 525 also updates the reverse map cache with the labels and hashes or other data characterizations of the retrieved data segments.

Step 530 decrements the reference count in the slab cache associated with the selected label. In an embodiment, if the reference count of a label is decremented to zero, then the label and its data segment are marked for deletion from the slab cache and its data segment slab file.

Decision block 535 determines if all of the labels in the label map have been processed by steps 510 to 530. If not all of the labels in the label map have been processed, method 500 returns to step 515 to process additional labels from the label map associated with the data stream.

Once all of the labels associated with the label map have been processed, method 500 proceeds to step 540. Step 540 updates the data segment slab files including any data segments affected by the deletion operation. In an embodiment, step 540 writes the updated and decremented reference counts for data segments associated with the label map back to their respective data segment slab files. In an embodiment, if the reference count of a data segment has been decremented to zero, an embodiment of step 540 marks this data segment for deletion from the data segment slab file. In a further embodiment, a garbage collection process removes unneeded data segments and associated reference counts and segment metadata from data segment slab files. An embodiment of step 540 transfers the updated slab files to the cloud storage.
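The label loop of steps 515 through 535 and the bookkeeping needed for step 540 can be sketched as below. The data shapes (a slab cache mapping each label to a small record with `refs` and `deleted` fields) and the helper names `load_slab` and `slab_of` are illustrative assumptions, not structures taken from the description.

```python
def delete_stream(label_map, slab_cache, load_slab, slab_of):
    """Decrement the reference count of every label in a label map.

    slab_cache maps label -> {"refs": int, "deleted": bool}; load_slab(slab_id)
    returns such records for a whole slab file (both shapes are hypothetical).
    Returns the ids of slab files that must be written back in step 540.
    """
    dirty = set()
    for label in label_map:                        # steps 515 and 535
        slab_id = slab_of(label)
        if label not in slab_cache:                # decision block 520
            slab_cache.update(load_slab(slab_id))  # step 525
        entry = slab_cache[label]
        entry["refs"] -= 1                         # step 530
        if entry["refs"] == 0:
            entry["deleted"] = True                # marked only; removal is deferred
        dirty.add(slab_id)
    return dirty
```

Segments are only marked here; physically removing them is left to a later rewrite of the slab file, which keeps the deletion path short.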

Step 545 updates the label map container file to remove the label map associated with the data stream identifier. In an embodiment, if the disk storage supports sparse files, the label map may be deleted directly without rewriting the label map container file. In another embodiment, if sparse files are not supported by the disk storage, then unneeded label maps are marked for deletion. A garbage collection process, similar to that used by embodiments of step 540, may be used to remove unnecessary label maps by rewriting label map container files when the number or proportion of label maps marked for deletion exceeds a threshold. An embodiment of step 545 transfers the updated label map container files to the cloud storage.

In an embodiment, steps 525, 540, and 545 may perform transfers to and from the cloud storage via the wide-area network in parallel and/or asynchronously with other steps of method 500. Similarly to step 390 above, steps 540 and 545 may identify changes in the locally stored label map container files and slab files in comparison with their counterparts (if any) stored in the cloud storage. Steps 540 and 545 transfer the changed label map container files and slab files to the cloud storage. In an embodiment, steps 540 and 545 communicate only the changed or new data to the cloud storage.
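One way to identify which local files differ from their cloud counterparts is to compare content digests, as in the sketch below. The description does not say how changes are detected, so the use of SHA-256 digests, the in-memory dictionaries, and the function name are assumptions for illustration.

```python
import hashlib


def changed_files(local_files, cloud_digests):
    """Return names of local files that are new or differ from the cloud copy.

    local_files maps name -> bytes; cloud_digests maps name -> hex SHA-256
    digest of the copy held by the cloud storage service (assumed bookkeeping).
    """
    changed = []
    for name, data in sorted(local_files.items()):
        digest = hashlib.sha256(data).hexdigest()
        if cloud_digests.get(name) != digest:  # missing in cloud, or content differs
            changed.append(name)
    return changed
```

Only the files named in the result would need to cross the wide-area network; unchanged slab files and label map container files are skipped.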

Embodiments of method 500 may return a deletion confirmation to the deduplicating storage interface client or other entity. In one embodiment, the deletion confirmation is provided following the successful retrieval of the label map corresponding with the data stream identifier in step 510. The remainder of method 500 may be performed as a background or low priority process by the deduplication and/or replication modules without impacting the performance of the client. In another embodiment, the deletion confirmation is returned to the client following the completion of method 500.

A further embodiment of method 500 may allow for deletion of a specified portion of data from a data stream. In this embodiment, for data segments that are partially contained within the specified portion of the data stream, the data from these data segments is retrieved and truncated so that only data outside of the specified portion of the data stream remains. This modified data is then re-encoded as one or more revised data segments and corresponding labels, which may be new to the spanning storage interface or may match previously created data segments, as described above. The labels representing data segments contained wholly or partially within the specified portion of a data stream are removed from the label map. The reference counts of these data segments are updated accordingly. The label map is rewritten to remove unused labels and to add labels for revised data segments.

In an embodiment, one or more garbage collection processes remove unneeded data segments, labels, and metadata from caches and files. Embodiments of the garbage collection process or processes may be performed independently of the above methods, for example as background or low-priority processes. Alternatively, some or all of the garbage collection processes may be performed as part of the above methods in creating or updating the slab and/or label map container files on disk storage and/or the slab cache and anchor caches in memory.

For example, a garbage collection process may remove unneeded data segments and associated reference counts and segment metadata from the data segment slab files. In an embodiment, the garbage collection process determines if the number or proportion of data segments marked for deletion in a data segment slab file exceeds a threshold. If this threshold is exceeded, then the entire data segment slab file is rewritten, with the data segments marked for deletion omitted from the rewritten data segment slab file.
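A minimal sketch of this threshold test follows, using the proportion of marked segments as the trigger. The list-of-records representation of a slab file, the default threshold of 0.5, and the function name are illustrative assumptions.

```python
def compact_slab(segments, threshold=0.5):
    """Rewrite a slab if the share of segments marked for deletion exceeds threshold.

    segments is a list of dicts with "label", "data", and "deleted" keys
    (a hypothetical in-memory view of one data segment slab file).
    Returns (segments_to_keep, rewritten).
    """
    if not segments:
        return segments, False
    dead = sum(1 for seg in segments if seg["deleted"])
    if dead / len(segments) <= threshold:
        return segments, False  # too few dead segments to justify a rewrite
    live = [seg for seg in segments if not seg["deleted"]]
    return live, True
```

Deferring the rewrite until enough segments are dead trades some wasted space for fewer full-file rewrites, each of which would also have to be transferred to the cloud storage.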

In another example, a garbage collection process removes labels from the anchor cache after the corresponding data segments have been loaded into the slab cache. In an embodiment, a garbage collection process uses label metadata attributes to identify labels in the slab cache corresponding with representative data segments and then compares these identified labels with the labels in the anchor cache. If a label in the anchor cache matches a label in the slab cache, the garbage collection process removes this label from the anchor cache, as this data segment is now loaded into memory in the slab cache.
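The anchor cache cleanup can be sketched as a comparison between two dictionaries. This simplified version checks every anchor entry against the slab cache and omits the label metadata attributes mentioned above; the cache layouts and the function name are assumptions.

```python
def prune_anchor_cache(anchor_cache, slab_cache):
    """Drop anchor entries whose label is already resident in the slab cache.

    anchor_cache maps segment hash -> label; slab_cache maps label -> segment
    bytes (both layouts are hypothetical). Returns the number of entries removed.
    """
    # Collect first, then delete, so the dict is not mutated while iterating.
    stale = [h for h, label in anchor_cache.items() if label in slab_cache]
    for h in stale:
        del anchor_cache[h]
    return len(stale)
```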

In many applications, some data segments may be used more frequently than other data segments. Typical frequently-used data segments can include data corresponding to repeating data patterns, such as data segments consisting entirely of null values or other data or file-format specific motifs.

To improve performance, an embodiment of the deduplicating data storage system stores frequently-used data segments separately from less-used data segments. In an embodiment, the deduplicating data storage system monitors the reference counts associated with data segments. When the reference count of a data segment is increased above a threshold value, that data segment is designated as a frequently-used data segment. An embodiment moves or copies this data segment to a separate slab file reserved for frequently-used data segments. The frequently-used data segment is relabeled as it is transferred to the frequently-used data segment slab file.

In an embodiment, the frequently-used data segment slab file is similar to other data segment slab files, such as data segment slab file 250 discussed above. In still a further embodiment, data segment reference counts are not maintained or updated for frequently-used data segments; accordingly, data segment reference counts may be omitted from the frequently-used data segment slab file.

Embodiments of the invention may store frequently-used data segments in memory for improved performance using a variety of different techniques. In a first embodiment, all of the frequently-used data segments and their associated labels and metadata from one or more frequently-used data segment slab files may be loaded into the slab cache or a separate frequently-used data segment cache during the initialization of the deduplication data storage system. In another embodiment, hashes or other data characterizations of all of the frequently-used data segments and their associated labels from one or more frequently-used data segment slab files are initially loaded into the anchor cache or a separate, similar cache. In this embodiment, the data associated with a frequently-used data segment is loaded into the slab cache as needed, in a similar manner as with other data segments as described above.

In an embodiment, frequently-used data segments stored in the slab cache are accessed for deduplicating additional data streams and retrieving deduplicated data in a similar manner as other data segments, as described above. However, in an embodiment, data segment reference counts are not maintained or updated in memory for frequently-used data segments. Therefore, an embodiment of the deduplicating data storage system does not increment an associated data segment reference count when a frequently-used data segment is used to deduplicate an additional data stream and does not decrement an associated data segment reference count when a data stream including a frequently-used data segment is deleted.
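The handling of frequently-used data segments can be sketched as a small reference-counting class. The class name, the `hot_threshold` parameter, and the in-memory sets are hypothetical, and the sketch leaves out the relabeling and the move to a separate slab file that accompany promotion in the description above.

```python
class SegmentStore:
    """Reference counting that stops once a segment becomes frequently used."""

    def __init__(self, hot_threshold=3):
        self.hot_threshold = hot_threshold
        self.refs = {}    # label -> reference count (regular segments only)
        self.hot = set()  # labels designated as frequently used

    def add_reference(self, label):
        if label in self.hot:
            return  # counts are not maintained for frequently-used segments
        self.refs[label] = self.refs.get(label, 0) + 1
        if self.refs[label] > self.hot_threshold:
            self.hot.add(label)   # designate as frequently used
            del self.refs[label]  # its count is no longer tracked

    def drop_reference(self, label):
        if label in self.hot:
            return  # deleting a stream does not touch frequently-used segments
        self.refs[label] -= 1
```

Because a promoted segment is never decremented, it is never marked for deletion, and the per-reference update cost disappears for exactly the segments that are referenced most often.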

Embodiments of the deduplicating data storage system may be used in a variety of data storage applications to store files, objects, databases, or any other type or arrangement of data in a deduplicated form.

FIG. 6 illustrates a computer system suitable for implementing embodiments of the invention. FIG. 6 is a block diagram of a computer system 2000, such as a personal computer or other digital device, suitable for practicing an embodiment of the invention. Embodiments of computer system 2000 may include dedicated networking devices, such as wireless access points, network switches, hubs, routers, hardware firewalls, WAN and LAN network traffic optimizers and accelerators, network attached storage devices, storage array network interfaces, and combinations thereof.

Computer system 2000 includes a central processing unit (CPU) 2005 for running software applications and optionally an operating system. CPU 2005 may be comprised of one or more processing cores. Memory 2010 stores applications and data for use by the CPU 2005. Examples of memory 2010 include dynamic and static random access memory. Storage 2015 provides non-volatile storage for applications and data and may include fixed or removable hard disk drives, flash memory devices, ROM memory, and CD-ROM, DVD-ROM, Blu-ray, HD-DVD, or other magnetic, optical, or solid state storage devices.

In a further embodiment, CPU 2005 may execute virtual machine software applications to create one or more virtual processors capable of executing additional software applications and optional additional operating systems. Virtual machine applications can include interpreters, recompilers, and just-in-time compilers to assist in executing software applications within virtual machines. Additionally, one or more CPUs 2005 or associated processing cores can include virtualization specific hardware, such as additional register sets, memory address manipulation hardware, additional virtualization-specific processor instructions, and virtual machine state maintenance and migration hardware.

Optional user input devices 2020 communicate user inputs from one or more users to the computer system 2000, examples of which may include keyboards, mice, joysticks, digitizer tablets, touch pads, touch screens, still or video cameras, and/or microphones. In an embodiment, user input devices may be omitted and computer system 2000 may present a user interface to a user over a network, for example using a web page or network management protocol and network management software applications.

Computer system 2000 includes one or more network interfaces 2025 that allow computer system 2000 to communicate with other computer systems via an electronic communications network, and may include wired or wireless communication over local area networks and wide area networks such as the Internet. Computer system 2000 may support a variety of networking protocols at one or more levels of abstraction. For example, computer system 2000 may support networking protocols at one or more layers of the seven layer OSI network model. An embodiment of network interface 2025 includes one or more wireless network interfaces adapted to communicate with wireless clients and with other wireless networking devices using radio waves, for example using the 802.11 family of protocols, such as 802.11a, 802.11b, 802.11g, and 802.11n.

An embodiment of the computer system 2000 may also include one or more wired networking interfaces, such as one or more Ethernet connections to communicate with other networking devices via local or wide-area networks.

The components of computer system 2000, including CPU 2005, memory 2010, data storage 2015, user input devices 2020, and network interface 2025, are connected via one or more data buses 2060. Additionally, some or all of the components of computer system 2000, including CPU 2005, memory 2010, data storage 2015, user input devices 2020, and network interface 2025, may be integrated together into one or more integrated circuits or integrated circuit packages. Furthermore, some or all of the components of computer system 2000 may be implemented as application specific integrated circuits (ASICs) and/or programmable logic.

FIG. 7 illustrates an example disaster recovery application 700 of a spanning storage interface according to an embodiment of the invention. Disaster recovery application 700 may be used to provide redundant data access to storage clients in the event that the storage clients and/or cloud spanning storage interface at a first network location are disabled, destroyed, or otherwise inaccessible or inoperable.

In example disaster recovery application 700, a first network location A 705 includes a first spanning storage interface 710. Spanning storage interface 710 provides storage access to one or more storage clients, such as storage client 720A and backup server 720B, via a local area network and/or a storage area network. Spanning storage interface 710 deduplicates data received from storage clients and transfers the deduplicated data via the wide area network 780 to one or more cloud storage services, such as cloud storage services 770 and 775, for storage. The spanning storage interface 710 may also retrieve deduplicated data via the wide area network 780 from one or more cloud storage services and reconstruct this data in its original form to provide to storage clients.

As discussed above, the spanning storage interface 710 includes local storage 715 to improve data access performance. Local storage 715 includes a local cache A 725 of a portion of the storage data provided by storage clients at network location A 705.

To provide disaster recovery, example application 700 includes a second network location B 735. Network location B 735 includes a second spanning storage interface 740. Spanning storage interface 740 is provided for disaster recovery operations and may be used to access the data associated with the first network location A 705 in the event that network location A 705 is disabled, destroyed, or otherwise inaccessible or inoperable.

To provide disaster recovery data access, the second spanning storage interface 740 can access deduplicated data stored in one or more of the cloud storage services 770 and/or 775 via wide-area network 780. The second spanning storage interface 740 reconstructs the original data from the retrieved deduplicated data and provides it to storage clients.

The second spanning storage interface 740 includes local storage B 745 for improving data access performance. In an embodiment, a copy 760 of some or all of the local cache A 725 used by the first spanning storage interface 710 is transferred to the local storage B 745 while the first network location 705 is operating. In the event of a disaster affecting the first network location 705, the second spanning storage interface 740 can provide data access to the first network location's data with the improved performance benefit provided by the copy of local cache A 760 in its local storage B 745.

Network location B 735 may be a dedicated disaster recovery network location. Alternatively, network location B may also optionally be used with one or more local storage clients, such as storage clients 750A and backup server 750B. In this further example, the second spanning storage interface B 740 performs data deduplication and facilitates cloud storage for data from storage clients 750. Like the first spanning storage interface 710, the second spanning storage interface B 740 in this example deduplicates second data received from storage clients at network location B 735 and transfers this second deduplicated data via the wide area network 780 to one or more cloud storage services, such as cloud storage services 770 and 775, for storage. The second spanning storage interface 740 may also retrieve second deduplicated data via the wide area network 780 from one or more cloud storage services and reconstruct this second data in its original form to provide to storage clients at the second network location B 735. To improve the performance of the second spanning storage interface 740, its local storage B 745 may include a local cache B 765, which includes a portion of the storage data provided by storage clients at network location B 735.

In yet a further embodiment, spanning storage interfaces 710 and 740 can operate in a paired disaster recovery configuration. For example, the second spanning storage interface 740 at network location B 735 may act as disaster recovery for the first spanning storage interface 710 at the first network location A 705. As described above, the local storage B 745 at the second network location B 735 may include a copy 760 of the local cache A 725 used by the first spanning storage interface 710. The copy 760 of local cache A in local storage B 745 improves the initial performance of the second spanning storage interface 740 in the event that it is required to substitute for the first spanning storage interface 710.

Similarly, in the paired disaster recovery configuration, first spanning storage interface 710 may act as disaster recovery for the second spanning storage interface 740. In the event that the second spanning storage interface 740 is destroyed, disabled, or otherwise unavailable to its storage clients, the first spanning storage interface 710 may provide access to storage data associated with the network location 735. Additionally, the local storage A 715 includes a copy 730 of the local cache B 765 used by the second spanning storage interface 740. The copy 730 of the local cache B 765 is transferred to the local storage A 715 while the second spanning storage interface 740 is operating. The copied version of local cache B 730 in local storage A 715 improves the initial performance of the first spanning storage interface 710 in the event that it is required to substitute for the second spanning storage interface 740.

In a further embodiment, the paired disaster recovery configuration can be extended to include additional network locations, with local storage at each network location including a copy of at least one (and possibly more than one) local cache from other spanning storage interfaces.

In an embodiment, copies of local caches of spanning storage interfaces may be transferred directly between network locations. For example, spanning storage interfaces at different network locations may communicate with each other to transfer and update copies of their local caches at other network locations. In another embodiment, a spanning storage interface can retrieve a portion of the deduplicated data from a cloud storage service to recreate a copy of a local cache of another spanning storage interface.

Further embodiments can be envisioned to one of ordinary skill in the art. In other embodiments, combinations or sub-combinations of the above disclosed invention can be advantageously made. The block diagrams of the architecture and flow charts are grouped for ease of understanding. However, it should be understood that combinations of blocks, additions of new blocks, re-arrangement of blocks, and the like are contemplated in alternative embodiments of the present invention.

The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that various modifications and changes may be made thereunto without departing from the broader spirit and scope of the invention as set forth in the claims.

1. A local and cloud spanning storage interface comprising: a front-end interface adapted to communicate with at least one storage client via a local-area network; a back-end interface adapted to communicate with at least one cloud storage service via a wide-area network; a data deduplication module adapted to reduce data redundancy in data received from the storage client to produce deduplicated data; and a replication module adapted to transfer deduplicated data to the cloud storage service using the back-end interface.
2. The local and cloud spanning storage interface of claim 1, wherein: the data deduplication module includes: a slab cache adapted to store a subset of labels and associated data segments in memory; a reverse map cache adapted to store associations in memory between each of the subset of labels in the slab cache with a portion of a data stream; and an anchor cache adapted to store associations in memory between portions of the data stream and a second subset of labels and data segments not stored in the slab cache; and a local data storage adapted to store a copy of at least a portion of the data received from the storage client, wherein the local data storage includes: a copy of a data segment slab file adapted to store in non-volatile storage the set of labels and associated data segments; and a copy of a label map container file adapted to store in non-volatile storage at least one label map specifying an arrangement of labels corresponding with the data stream.
3. The local and cloud spanning storage interface of claim 2, wherein the data segment slab file and the label map container file are stored using the cloud storage service.
4. The local and cloud spanning storage interface of claim 2, wherein the copy of the data segment slab file and the copy of the label map container file are selected from a plurality of data segment slab files and label map container files stored using the cloud storage service.
5. The local and cloud spanning storage interface of claim 4, wherein the copy of the data segment slab file and the copy of the label map container file are selected from a plurality of data segment slab files and label map container files based on a cache criteria.
6. The local and cloud spanning storage interface of claim 1, comprising: a shell file system module adapted to present a shell file system representing the data to the storage client.
7. The local and cloud spanning storage interface of claim 1, wherein the front-end interface includes a file system interface adapted to communicate with the storage client using a file system protocol.
8. The local and cloud spanning storage interface of claim 1, wherein the front-end interface includes a backup system interface adapted to communicate with the storage client using a backup system protocol.
9. The local and cloud spanning storage interface of claim 1, wherein the front-end interface includes a cloud storage interface adapted to communicate with the storage client using a cloud storage protocol.
10. The local and cloud spanning storage interface of claim 1, wherein the front-end interface includes an archival interface adapted to communicate with the storage client using an archival protocol.
11. The local and cloud spanning storage interface of claim 1, wherein the front-end interface includes an object interface adapted to communicate with the storage client using a binary large object protocol.
12. The local and cloud spanning storage interface of claim 1, wherein the front-end interface includes a block storage interface adapted to communicate with the storage client using a block storage protocol.
13. The local and cloud spanning storage interface of claim 12, wherein the block storage protocol includes iSCSI.
14. The local and cloud spanning storage interface of claim 1, wherein the replication module is adapted to communicate a first portion of the deduplicated data to a first cloud storage service and a second portion of the deduplicated data to a second cloud storage service.
15. The local and cloud spanning storage interface of claim 14, wherein the first and second portions of the deduplicated data are determined from a specification provided by a user.
16. The local and cloud spanning storage interface of claim 14, wherein the first and second portions of the deduplicated data are determined at least in part on a file path.
17. The local and cloud spanning storage interface of claim 1, wherein the replication module is adapted to abandon the transfer of the deduplicated data to the cloud storage service in response to a quota being exceeded.
18. The local and cloud spanning storage interface of claim 17, wherein the quota is associated with a storage client.
19. A method of deduplicating a data stream, the method comprising: receiving a data stream; generating new data segments from the data stream; determining if the new data segments match previously generated data segments stored only on a cloud storage service; in response to the determination that at least one of the new data segments matches one of the previously generated data segments stored only on the cloud storage service, retrieving a set of data segments including the matching previously generated data segment from the cloud storage service; assigning provisional labels to at least a portion of the new data segments not matching locally cached data segments; and adding the provisional labels to the label map associated with the data stream.
20. The method of claim 19, comprising: comparing the new data segments assigned to the provisional labels with the set of data segments retrieved from the cloud storage service; in response to the determination that the new data segment matches one of the set of data segments, discarding the provisional label assigned to the new data segment and assigning a previously generated label associated with the matching one of the set of data segments to the new data segment.
21. The method of claim 19, comprising: determining a set of changes to a locally stored label map container file and a locally stored slab file; and transferring the set of changes to the cloud storage service to update a full and authoritative set of data segments and label maps.
22. The method of claim 21, wherein transferring the set of changes includes performing an atomic operation to update the full and authoritative set of data segments and label maps.